Ab initio calculation of the anomalous Hall conductivity by Wannier interpolation 
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The intrinsic anomalous Hall effect in ferromagnets depends on subtle spin-orbit-induced effects 
in the electronic structure, and recent ab-initio studies found that it was necessary to sample the 
Brillouin zone at millions of fc-points to converge the calculation. We present an efficient first- 
principles approach for computing the anomalous Hall conductivity. We start out by performing 
a conventional electronic-structure calculation including spin-orbit coupling on a uniform and rela- 
tively coarse fc-point mesh. From the resulting Bloch states, maximally-localized Wannier functions 
are constructed which reproduce the ab-initio states up to the Fermi level. The Hamiltonian and 
position-operator matrix elements, needed to represent the energy bands and Berry curvatures, are 
then set up between the Wannier orbitals. This completes the first stage of the calculation, whereby 
the low-energy ab-initio problem is transformed into an effective tight-binding form. The second 
stage only involves Fourier transforms and unitary transformations of the small matrices set up in 
the first stage. With these inexpensive operations, the quantities of interest are interpolated onto 
a dense fe-point mesh and used to evaluate the anomalous Hall conductivity as a Brillouin zone 
integral. The present scheme, which also avoids the cumbersome summation over all unoccupied 
states in the Kubo formula, is applied to bcc Fe, giving excellent agreement with conventional, less 
efficient first-principles calculations. Remarkably, we find that more than 99% of the effect can be 
recovered by keeping a set of terms depending only on the Hamiltonian matrix elements, not on 
matrix elements of the position operator. 

PACS numbers: 71.15.Dx, 71.70.Ej, 71.18.+y, 75.50.Bb, 75.47.-m. 



I. INTRODUCTION 

The Hall resistivity of a ferromagnct depends not only 
on the magnetic induction, but also on the magnetiza- 
tion; the latter dependence is known as the anomalous 
Hall effect (AHE). 1 The AHE is used for investigating 
surface magnetism, and its potential for investigating 
nanoscale magnetism, as well as for magnetic sensors and 
memory devices applications, is being considered. 2 The- 
oretical investigations of the AHE have undergone a re- 
vival in recent years, and have also lead to the proposal 
for a spin counterpart, the spin Hall effect, which has 
subsequently been realized experimentally. 

The first theoretical model of the AHE was put forth 
by Karplus and Luttinger, 3 who showed that it can arise 
in a perfect crystal as a result of the spin-orbit interaction 
of polarized conduction electrons. Later, two alternative 
mechanisms, skew scattering 4 and side jump scattering, 5 
were proposed by Smit and Berger respectively. In skew 
scattering the spin-orbit interaction gives rise to an asym- 
metric scattering cross section even if the defect poten- 
tial is symmetric, and in side-jump scattering the spin- 
orbit coupling causes the scattered electron to acquire an 
extra transverse translation after the scattering event. 
These two mechanisms involve scattering from impuri- 
ties or phonons, while the Karplus-Luttingcr mechanism 
is a scattering-free bandstructure effect. For reasons re- 
lated to the absence of an intuitive physical picture and 
the lack of reliable quantitative estimates based on band- 
structure calculations, the Karplus-Luttinger theory was 
strongly disputed in the early literature. 



In recent years, new insights into the Karplus- 
Luttinger mechanism have been obtained by several 
authors, 7-11 who reexamined it in the modern language of 
Berry's phases. The term H„(k) in the equations below 
was recognized as the Berry curvature of the Bloch states 
in reciprocal space, a quantity which had previously ap- 
peared in the theory of the integer quantum Hall effect, 12 
and also in the Berry-phase theory of polarization. 13 The 
anomalous Hall conductivity (AHC) is simply given as 
the Brillouin zone (BZ) integral of the Berry curvature 
weighted by the occupation factor of each state, 

axy = JY^h £ / dkfn{k) n "'* (k) • (1) 

l ) n JBZ 

While this can be derived in several ways, it is per- 
haps most intuitively understood from the semiclassical 
point of view, in which the group velocity of an electron 
wavepacket in band n is 6,8 

r = l^-kxO„(k). (2) 

The second term, often overlooked in elementary text- 
book derivations, is known as the "anomalous velocity." 
The expression for the current density J then acquires 
a new term e/„(k)k x ll„(k) which, with k = — eE/h, 
leads to Eq. (1). 

Recently, first-principles calculations of Eq. (1) were 
carried out for the ferromagnetic perovskite SrRu03 by 
Fang et al., 14 and for a transition metal, bcc Fe, by Yao 
et al. 15 In both cases the calculated values compared well 
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with experimental data, lending credibility to the intrin- 
sic mechanism. The most striking feature of these cal- 
culations is the strong and rapid variation of the Berry 
curvature in fc-space. In particular, there are sharp peaks 
and valleys at places where two energy bands are split by 
the spin-orbit coupling across the Fermi level. In order 
to converge the integral, the Berry curvature has to be 
evaluated over millions of fc-points in the Brillouin zone. 
In the previous work this was done via a Kubo formula 
involving a large number of unoccupied states; the com- 
putational cost was very high, even for bcc Fe, with only 
one atom in the unit cell. 

In this paper, we present an efficient method for com- 
puting the AHC. Unlike the conventional approach, it 
does not require carrying out a full ab-initio calculation 
for every fc-point where the Berry curvature needs to be 
evaluated. The actual ab-initio calculation is performed 
on a much coarser fc-point grid. By a post-processing 
step, the resulting Bloch states below and immediately 
above the Fermi level are then mapped onto well- localized 
Wannier-functions. In this representation it is then pos- 
sible to interpolate the Berry curvature onto any desired 
fc-point with very little computational effort and essen- 
tially no loss of accuracy. 

The paper is organized as follows. In Sec. II we intro- 
duce the basic definitions and describe the Kubo-formula 
approach used in previous calculations of the intrinsic 
AHC. In Sec. Ill our new Wannier-based approach is 
described. The details of the band-structure calcula- 
tion and Wannier-function construction are described in 
Sec. IV, followed by an application of the method to bcc 
Fc in Sec V. Finally, Sec. VI contains a brief summary 
and discussion. 



II. DEFINITIONS AND BACKGROUND 

The key ingredient in the theory of the intrinsic anoma- 
lous Hall effect is the Berry curvature fi„(k), defined as 



«„(k) = V x A„(k) , 
where A„ is the Berry connection, 

A n (k) = «(u„k|VkKik) • 



(3) 



(4) 



The Berry curvature can be written in an equivalent but 
more explicit form: 



£V 7 (k) = e a/ 3 7 Q„ ;Cl /3(k) 



fi„,a/3(k) = -21m 



du n l 



dk n 



du r 



dk, 



(5) 



(6) 



where the Greek letters indicate Cartesian coordinates, 
e Q/ 3 7 is Levi-Civita tensor and u„k are the cell-periodic 
Bloch functions. The second-rank Berry curvature tensor 
^n, a/3 (k) is introduced for later use. The integral of the 



Berry curvature over a surface bounded by a closed path 
C in fc-space is the Berry phase of that path. 16 

With this notation we rewrite the quantity we wish to 
evaluate, Eq. (1), as 



Tap 



(7) 



where we have introduced the total Berry curvature 

O a/3 (k) /n(k)fin,a/3(k). (8) 

n 

Direct evaluation of Eq. (6) poses a number of practi- 
cal difficulties related to the presence of fc-derivatives of 
Bloch states, as will be discussed in the next section. In 
previous work 1 ' 15 these were circumvented by recasting 
Eq. (6) as a Kubo formula, 12 where the fc-derivatives are 
replaced by sums over states: 



ftn,a/?(k) = -21m ^2 



(k) (k) 

(w m (k) - w„(k)) 2 



(9) 



where w„(k) = £ n u_/h and the matrix elements of the 
Cartesian velocity operators v a = (i/H){H 7 r a ] are given 
by 17 



,, a (k) = (lp n k\Va\lpmk) = -jT 



dH(k) 



dk D 



( 10 ) 

where H (k) = e~* k r H e* k r . The merit of Eq. (9) lies in 
its practical implementation on a finite fc-grid using only 
the wave functions at a single fc-point. As is usually the 
case for such linear-response formulas, sums over pairs of 
occupied states can be avoided in the T = version of 
the formula (8) for the total Berry curvature, 



(k) v cv .p(k) 
Mk)-^(k)) 2 



(11) 



where v and c subscripts denote valence (occupied) and 
conduction (unoccupied) bands, respectively. However, 
the evaluation of this formula requires the cumbersome 
summation over unoccupied states. Even if practical cal- 
culations truncate the summation to some extent, the 
computation could be time-consuming. Moreover, the 
time required to calculate the matrix elements of the ve- 
locity operator in Eq. (9) or (11) is not negligible. 



III. EVALUATION OF THE BERRY 
CURVATURE BY WANNIER INTERPOLATION 

In view of the above-mentioned drawbacks of the Kubo 
formula for practical calculations, it would be highly de- 
sirable to have a numerical scheme based on the the "ge- 
ometric formula" (6), in terms of the occupied states 
only. The difficulties in implementing that formula arise 
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FIG. 1: Band structure of bcc Fe with spin-orbit coupling 
included. Solid lines: original band structure from a con- 
ventional first-principles calculation. Dotted lines: Wannier- 
interpolated band structure. The zero of energy is the Fermi 
level. 



form the fc-derivatives therein. Since in practice one al- 
ways replaces the Brillouin zone integration by a dis- 
crete summation, an obvious approach would be to use 
a finite-difference representation of the derivatives on 
the fc-point grid. However, this requires some care: a 
straightforward discretization will yield results which de- 
pend on the choice of phases of the Bloch states (i.e., 
the choice of gauge), even though Eq. (6) is in princi- 
ple gauge- invariant. The problem becomes more acute 
in the presence of band crossings and avoided crossings, 
because then it is not clear which two states at neigh- 
boring grid points should be taken as "partners" in a 
finite-differences expression. (Moreover, since the sys- 
tem is a metal, at T = the occupation can be different 
at neighboring fc-points.) Successful numerical strategies 
for dealing with problems of this nature have been devel- 
oped in the context of the Berry-phase theory of polariza- 
tion of insulators, and a workable finite-difference scheme 
which combines those ideas with Wannier interpolation 
is sketched in Appendix B. 

We present here a different, more powerful strategy 
that also relies on a Wannier representation of the low- 
energy electronic structure. We will show that it is possi- 
ble to express the needed derivatives analytically in terms 
of the Wannier functions, so that no finite-difference eval- 
uation of a derivative is needed in principle. The use of 
Wannier functions allows us to achieve this while still 
avoiding the summation over all empty states which ap- 
pears in the Kubo formula as a result of applying con- 
ventional k ■ p perturbation theory. 



A. Wannier representation 

We begin by using the approach of Souza, Marzari, 
and Vanderbilt 18 to construct a set of Wannier functions 



(WFs) for the metallic system of interest. For insulators, 
one normally considers a set of WFs that span precisely 
the space of occupied Bloch states. Here, since we have a 
metallic system and we want to have well-localized WFs, 
we choose a number of WFs larger than the number of 
occupied states at any k, and only insist that the space 
spanned by the WFs should include, as a subset, the 
space of the occupied states, plus the first few empty 
states. Thus, these partially-occupied WFs will serve 
here as a kind of "exact tight-binding basis" that can 
be used as a compact representation of the low-energy 
electronic structure of the metal. 

This is illustrated in Fig. 1, where the bandstructure 
of bcc Fe is shown. The details of the calculations will be 
presented later in Sec. IV. The solid lines show the full 
ab-initio bandstructure, while the dashed lines show the 
bands obtained within the Wannier representation using 
M = 18 WFs per cell (9 of each spin; see Sec. IV B). 
In the method of Ref. 18, one specifies an energy i? w i n 
lying somewhat above the Fermi energy E{, and insists 
on finding a set of WFs spanning all the states in an 
energy window up to i? w in- In the calculation of Fig. 1 
we chose E w i n ~ 18 eV, and it is evident that there is 
an essentially perfect match between the fully ab-initio 
and the Wannier-represented bands up to, but not above, 

TP. . 

More generally, we shall assume that we have M WFs 
per unit cell (where M > everywhere in the BZ) such 
that the Bloch-like functions given by the phased sum of 
the Wannier orbitals, 

l«SF> = E e " <k ' ( '" R) |R»> ( 12 ) 

R 

(n = 1, ...,M), span the actual Bloch eigenstates u„t) 
of interest (n = 1, Nk) at each k. It follows that, if we 
construct the M x M Hamiltonian matrix 

H™<M = (ig>\HQc)\u™) (13) 

and diagonalize it by finding an M x M unitary rotation 
matrix U(k) such that 

U Jf (k)H ( - w ^(k)U(k)=H^\k) (14) 

where ffi^(k) = £^5 nm , then £^ will be identical 
to the true £ n k for all occupied bands. Also, the corre- 
sponding Bloch states 

I^HEl^KWk) (15) 

m 

will also be identical to the true eigenstates \u n k) for 
£ < Ef. (In the scheme of Ref. 18, these properties will 
actually hold for energies up to -E w in-) However, the band 
energies and Bloch states will not generally match the 
true ones at higher energies, as shown in Fig. 1. We thus 
use the superscript 'H' to distinguish the projected band 
energies £^ and eigenvectors \u^) from the true ones 
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£nk and |w„k), keeping in mind that this distinction is 
only significant in the higher-energy unoccupied region 
{£ > E w i n ) of the projected bandstructure. 

The unitary rotation of states expressed by the matrix 
U (k) is often referred to as a "gauge transformation," and 
we shall adopt this terminology here. We shall refer to 
the Wannicr-derived Bloch-like states ) as belonging 

to the Wannier (W) gauge, while the eigenstates |«„ k ) 
of the projected bandstructure are said to belong to the 
Hamiltonian (H) gauge. 

Quantities such as the Berry connection A„(k) of 
Eq. (4) and the Berry curvature 0„ ia/ g(k) of Eq. (3) 
clearly depend upon the gauge in which they are ex- 
pressed. The quantity that we wish to calculate, Eq. (8), 
is most naturally expressed in the Hamiltonian gauge, 
where it takes the form 



M 



n^(k) = ^/„(k)o2 /3 (k) 



(16) 



n=l 



( R") 

Here Q^^Jk) is given by Eq. (6) with |w nk ) 



n,ct(3 



,(h; 



). It 

is permissible to make this substitution because the pro- 
jected bandstructure matches the true one for all occu- 
pied states. In practice one may take for the occupation 
factor /„(k) = 9(Ef — £„k) or introduce a small thermal 
smearing as desired. 

Our strategy now is to see how the right-hand side 
of Eq. (16) can be obtained by starting with quanti- 
ties that are defined and computed first in the Wan- 
nier gauge and then transformed into the Hamiltonian 
gauge. The resulting scheme can be viewed as a gen- 
eralized Slater-Koster interpolation, which takes advan- 
tage of the smoothness in fe-space of the Wannier-gauge 
objects, a direct consequence of the short range of the 
Wannier orbitals in real space. 



B. Gauge transformations 

Because the gauge transformation of Eq. (15) involves 
a unitary rotation among several bands, it is useful to 
introduce generalizations of the quantities in Eqs. (3-4) 
having two band indices instead of one. Thus, we define 



t (k) = i(u n \d a u m ) 



(17) 



and 



^nm,a/?(k) = i{d a u n \dpu m ) - i(dpu n \d a u m ) , (18) 

where every object in each of these equations should con- 
sistently carry either a (W) or (H) label. (We have now 
suppressed the k subscripts and introduced the notation 
d a = d/dk a for conciseness.) In this notation, Eq. (16) 
becomes 

M 

^(k) = E/«( k )^S«M k )- ( 19 ) 

n=l 



This matrix is antisymmetric in the Cartesian indices. 
Note that when Q, a p appears without a (W) or (H) su- 
perscript, as on the left-hand side of this equation, it 
denotes the total Berry curvature on the left-hand side 
of Eq. (16). 

The matrix representation of an ordinary operator 
such as the Hamiltonian or the velocity can be trans- 
formed from the Wannier to the Hamiltonian gauge, or 
vice versa, just by operating on the left and right by 
c^(k) and U(k), as in Eq. (14); such a matrix is called 
"gauge-covariant." Unfortunately, the matrix objects in 
Eqs. (17-18) are not gauge-covariant, because they in- 
volve fc-derivatives acting on the Bloch states. For ex- 
ample, a straightforward calculation shows that 



A™ = U*AWu + itfd a U 



(20) 



where each object is an M x M matrix and matrix prod- 
ucts are implied throughout. For every matrix object O, 
we define 



^(H) = rfQWu 



(21) 



,(H) 



so that, by definition, O x "' = only for gauge- 

covariant objects. 

The derivative d a U may be obtained from ordinary 
perturbation theory. We adopt a notation in which \\4>m)) 
is the m-th M-component column vector of matrix U , so 
that ((0 n ||if( w )||(/) m )) = £ n 8 nm ; the stylized bra-ket no- 
tation is used to emphasize that objects like £f( w ) and 
\\4> n )) are M x M matrices and M-component vectors, 
i.e., operators and state vectors in the "tight-binding 
space" defined by the WFs, not in the original Hilbcrt 
space. Perturbation theory with respect to the parame- 
ter k takes the form 



\\d a <f 



k\\H a W) \\<j> r , 



c(H) _ c 



(H) 



(22) 



where H a W ^ = d a H^ w K In matrix notation this can be 
written 



da U mn = J2u m iD^ a = (UD^) 



(23) 



where 



DW a = (U^d a U) r 



-77(H) 

nm,a 
^(H) _ ^(H) 



if n ^ m 



if n = m 



(24) 



and H 



■77(H) 



(WH a W) U) nm according to Eq. (21). Note 
that while Q a p and A a arc Hcrmitian in the band indices, 

D a is instead antihermitian. The gauge choice implicit 
in Eqs. (22) and (24) is ((^ n \\d a (f) n )) = (U^d a U) nn = 
(this is the so-called "parallel transport" gauge). 
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Using Eq. (23), Eq. (20) becomes 

A™=1™+iD™ (25) 
and the derivative of Eq. (15) becomes 

\d a u^) = \d a uM)U mn + \^)DW a ■ (26) 

m m 

Plugging the latter into Eq. (18), we finally obtain, after 
a few manipulations, the matrix equations 

«S = ^-[^4 H) ] 

+ [Df\A^]-i[D^,D^}. (27) 

(Hi 

The band-diagonal elements Sl y nn Q/3 (k) then need to be 
inserted into Eq. (19). 

C. Discussion 

We expect, based on Eq. (9), that the largest contribu- 
tions to the AHC will come from regions of fc-space where 
there are small energy splittings between bands (for ex- 
ample, near spin-orbit-split avoided crossings). 14 In the 
present formulation, this will give rise to small energy 

( HI 

denominators in Eq. (24), leading to very large D a val- 
ues in those regions. These large and spiky contributions 



will then propagate into and , whereas A 

and Q^P , and also A^ and , will remain with their 
typically smaller values. Thus, these spiky contributions 
will be present in the second and third terms, and espe- 
cially in the fourth term, of Eq. (27). The contributions 
of these various terms are illustrated for the case of bcc 
Fe in Sec. V A, and we show there that the last term typ- 
ically makes by far the dominant contribution, followed 
by the second and third terms, and then by the first term. 

The dominant fourth term can be recast in the form 
of a Kubo formula as 



>(H) 



,(W) 



-21m £ ■ 



t>nk 



\H, 



(W), 



(H) 



(H) 



(28) 



The following differences between this equation and the 
true Kubo formula, Eq. (9), should however be kept 
in mind. First, the summation in Eq. (28) is re- 
stricted to the M-band projected band structure. Sec- 
ond, above E w i n the projected bandstructure deviates 
from the original ab-initio one. Third, even below E w - ln , 
where they do match exactly, the "tight-binding veloc- 
ity matrix elements" appearing in Eq. (28) differ from 
the ab-initio ones, given by Eq. (10). (The relation be- 
tween them is particularly simple within the inner win- 
dow, and follows from combining the identity A nm ^ a — 
i^r^Vo^ipm) I {u m — Wn)i valid for m ^ n, with Eqs. (24- 
25).) All these differences are however exactly compen- 
sated by the previous three terms in Eq. (27). We empha- 
size that all terms in that equation are defined strictly 



within the projected space spanned by the Wannier func- 
tions. 

We note in passing that it is possible to rewrite Eq. (27) 
in such a way that the large spiky contributions are iso- 
lated into a single term. This alternative formulation, 
which turns out to be related to a gauge-covariant cur- 
vature tensor, will be described in Appendix A. 



D. Sum over occupied bands 



)(H) 



In the above, we have proposed to compute £l nn a ^ 
from Eq. (27) and insert it into the band sum, Eq. (19), in 
order to compute the AHC. However, this approach has 
a shortcoming in that small splittings (avoided crossings) 
between a pair of occupied bands n and to leads to large 
values of Dnm )Q , and thus to large but canceling contri- 
butions to the AHC coming from 0„„ ia/3 and Q^^p- 
Here, we rewrite the total Berry curvature (19) in such a 
way that the cancellation is explicit. 

Inserting Eq. (27) into Eq. (19) and interchanging 
dummy labels n«min certain terms, we obtain 



n 

"I" ^ ^ (fm ~ In) (j^nrn,a^-rnn,l3 
nm 

The factors of (f m — /„) insure that terms arising from 
pairs of fully occupied states give no contribution. Thus, 
the result of this reformulation is that individual terms 
in Eq. (29) have large spiky contributions only when 
avoided crossings or near-degeneracies occur across the 
Fermi energy. This approach is therefore preferable from 
the point of view of numerical stability, and it is the for- 
mula that we have implemented in the current work. 

As expected from the discussion in Sec. IIIC and 
shown later in Sec. VB, the dominant term in Eq. (29) 
is the last one, 



(H) 

mn,/3 



(30) 



or, in a more explicitly Kubo-like form, 



(H) 
1 nm,a J 



f (H) 

■ mn,/3 



(H)\ 



(31) 



In the zero-temperature limit, the latter can easily be 
cast into a form like Eq. (28), but with the a double sum 
running over occupied bands n and unoccupied bands 
to, very reminiscent of the original Kubo formula in 

Eq. (11). We remark that (m e /T%)H nm a coincides with 
the "effective tight-binding momentum operator" defined 
in Ref. 19. 
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It is worth pointing out that Eq. (28) can be cast ex- 
plicit as a Berry curvature, the tight-binding-space ana- 
log of Eq. (6), 

^n,a/3 = -21m ({d a <t> nk \\d0(t>nk} ■ (32) 

In this way Eq. (31) can be written in a form that closely 
resembles the total Berry curvature, Eq. (16): 

M 

— X] f n ^n.a(3 ■ (33) 
n=l 

E. Evaluation of the Wannier-gauge matrices 

Eq. (29) is our primary result. To review, recall that 
this is a condensed notation expressing the M x M m&- 

trix n< n2., a p( k ) in terms of matrices the n^ Q/3 (k), etc. 
The basic ingredients needed are the four matrices H^ w \ 
i/a W \ and fi^jp at a given k. Diagonalization of 

the first of them yields the energy eigenvalues needed to 
find the occupation factors /„. It also provides the gauge 

(H) 

transformation U which is then used to construct H a , 

A a , and £l a g from the other three objects via Eq. (21). 

Finally, is inserted into Eq. (24) to obtain 

and all terms in Eq. (29) are evaluated. 

In this section we explain how to obtain the matrices 

ff( w )(k), #i W) (k), A^ W) (k) and fl ( ™\k) at an arbitrary 
point k for use in the subsequent calculations described 
above. 

1. Fourier transform expressions 

The four needed quantities can be expressed as follows: 
^k)=£ e ik - R (On|i?|Rm>, (34) 

R 

- E £lk ' R iR « (On\H\Hm) , (35) 

R 

^ tt (k)=E e ^ R (On|f Q |Rm), (36) 

R 

n2 a/3 (k) - (ziMOnMRm) 

R 

-iRp (On\r a \Rmf) . (37) 

(The notation |0n) refers to the n'th WF in the home unit 
cell R = 0.) Eq. (34) follows by combining Eqs. (12) and 
(13), while Eq. (36) follows by combining Eqs. (12) and 
(17). Eqs. (35) and (37) are then obtained from (34) and 



(36) using H nm ^ a = d a H nm and f2 nm , Q/3 = d a A nm ^ - 
dpAnm^a, respectively. 

It is remarkable that the only real-space matrix ele- 
ments that are required between WFs are those of the 
four operators H and r a (a = x, y, and z). Because 
the WFs are strongly localized, these matrix elements 
are expected to decay rapidly as a function of lattice vec- 
tor R, so that only a modest number of these real-space 
matrix elements need to be computed and stored once 
and for all. Collectively, they define our "exact tight- 
binding model" and suffice to allow subsequent calcu- 
lation of all needed quantities. Furthermore, the short 
range of these matrix elements in real space insures that 
the Wannier-gauge quantities on the left-hand sides of 
Eqs. (34-37) will be smooth functions of k, thus justifying 
the earlier discussion in which it was argued that these 
objects should have no rapid variation or enhancement 
in /c-space regions where avoided crossings occur. (Re- 
call that such large, rapidly-varying contributions only 
appear in the matrices and in quantities that de- 

pend upon them.) It should however be kept in mind that 
Eq. (29) is not written directly in terms of the smooth 
quantities (34-37), but rather in terms of those quantities 
transformed according to Eq. (21). The resulting objects 
are not smooth, since the matrices U change rapidly with 
k. However, even while not smooth, they remain small. 



2. Evaluation of real-space matrix elements 

We conclude this section by discussing the calcula- 
tion of the fundamental matrix elements (0n\H\Hm,) 
and (0n|f Q |Rm). There are several ways in which these 
could be computed, and the choice could well vary from 
one implementation to another. One possibility would 
be to construct the WFs in real space, say on a real- 
space grid, and then to compute the Hamiltonian and 
position-operator matrix elements directly on that grid. 
In the context of a code that uses a real-space basis 
(e.g., localized orbitals or grids), this might be the best 
choice. However, in the context of plane-wave methods 
it is usually more convenient to work in reciprocal space 
if possible. This is in the spirit of the Wannicr-function 
construction scheme, 18,20 which is formulated as a post- 
processing step after a conventional ab-initio calculation 
carried out on a uniform /c-point grid. (In the follow- 
ing we will use the symbol q to denote the points of 
this ab-initio mesh, to distinguish them from arbitrary 
or interpolation-grid points denoted by k.) 

The end result of the Wannier-construction step are 
M Bloch-like functions \u^} at each q. The WFs are 
obtained from them via a discrete Fourier transform: 

iR»> = 4£«-*- (R - f W>- (38) 

1 q 

This expression follows from inverting Eq. (12). If the 
ab initio mesh contains N q x N q x N q points, the re- 
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suiting WFs are really periodic functions over a super- 
ccll of dimensions L x L xL, where L = N q a and a is 
the lattice constant of the unit cell. The idea then is 
to choose L large enough that the rapid decay of the 
localized WFs occurs on a scale much smaller than L. 
This ensures that the matrix elements (On\H\Rm) and 
(On\f a \TLm) between a pair of WFs separated by more 
than L/2 are negligible, so that further refinement of 
the ab-initio mesh will have a negligible impact on the 
accuracy of Wannicr-interpolatcd quantities. (In partic- 
ular, the interpolated band structure, Fig. 1, is able to 
reproduce tiny features of the full bandstructure, such 
as spin-orbit-induced avoided crossings, even if they oc- 
cur on a length scale much smaller than the ab-initio 
mesh spacing.) While the choice of reciprocal-space cell 
spanned by the vectors q is immaterial, because of the 
periodicity of reciprocal space, this is not so for the vec- 
tors R. In practice we choose the N q x N q x N q vectors 
R to be evenly distributed on the Wigner-Seitz supercell 
of volume N q a 3 centered around R = 0. 18 This is the 
most isotropic choice possible, ensuring that the strong 
decay of the matrix elements for |R| ~ L/2 is achieved 
irrespective of direction. 

The matrix elements of the Hamiltonian are obtained 
from Eq. (38) as 

(On\H\Bm) = ^ E ^^^^(q) , (39) 
« q 

which is the reciprocal of Eq. (34), with the sum running 
over the coarse ab-initio mesh. The position matrix is 
obtained similarly by inverting Eq. (36): 



(On|f a |Rm) = 




The matrix A^} tCC (q) is then evaluated by approximating 
the /c-derivatives in Eq. (17) by finite-differences on the 
ab-initio mesh using the expression 20 

4$ a (q) * ^ J2 w ^ (« } l«K+b> - S nm) , (41) 

b 

where b are the vectors connecting q to its nearest neigh- 
bors on the ab-initio mesh. This approximation is valid 
because the Bloch states vary smoothly with k in the 
Wannier gauge. We note that the overlap matrices ap- 
pearing on the right-hand side are available "for free" as 
they have already been computed and stored during the 
WF construction procedure. This is also the case for the 
matrices #( w )(q) needed in Eq. (39). 

It should be noted that the /c-space finite- difference 
procedure outlined above entails an error of order 0(Aq 2 ) 
in the values of the position operator matrix elements, 
where Aq is the ab-initio mesh spacing. The importance 
of such an error is easily assessed by trying denser g-point 
meshes; in our case, we find that it is not a numerically 
significant source of error for the 8x8x8 mesh that we 



employ in our calculations. (In large measure this is sim- 
ply because less than 0.5% of the total AHC comes from 
terms that depend on these position-operator matrix ele- 
ments, as will be discussed in Section V. Indeed, we find 
that the 0(Aq 2 ) convergence of this small contribution 
hardly shows in the convergence of the total AHC, which 
empirically appears to be approximately exponential in 
the ab-initio mesh density.) However, if the 0(Aq 2 ) con- 
vergence is a source of concern, one could adopt the direct 
real-space mesh integration method mentioned at the be- 
ginning of this subsection, which should be free of such 
errors. 



IV. COMPUTATIONAL DETAILS 

In this section we present some of the detailed steps of 
the calculations as they apply to our test system of bec 
Fe. First, we describe the first-principles bandstructure 
calculations that are carried out initially. Second, we 
discuss the procedure for constructing maximally local- 
ized Wannier functions for the bands of interest following 
the method of Souza, Marzari, and Vanderbilt. 18 Third, 
we discuss the variable treatment of the spin-orbit in- 
teraction within these first-principles calculations, which 
is useful for testing the dependence of the AHC on the 
spin-orbit coupling. 

A. Band structure calculation 

Fully rclativistic band structure calculations for bec 
Fe in its ferromagnetic ground state at the experimental 
lattice constant a = 5.42 Bohr are carried out using the 
PWSCF code. 21 A kinetic-energy cutoff of 60 Hartree is 
used for the planewave expansion of the valence wave- 
functions (400 Hartree for the charge densities). Ex- 
change and correlation effects are treated with the PBE 
generalized-gradient approximation. 22 

The core-valence interaction is described here by 
means of norm-conserving pseudopotentials which in- 
clude spin-orbit effects 23,24 in separable Klcinman- 
Bylander form. (Our overall Wannier interpolation ap- 
proach is quite independent of this specific choice and 
can easily be generalized to other kinds of pseudopoten- 
tials or to all-electron methods.) The pseudopotential 
was constructed using a reference valence configuration of 
3d 7 4s°- 75 4p - 25 . We treat the overlap of the valence states 
with the semicore 3p states using the non-linear core cor- 
rection approach. 25 The pseudopotential core radii for 
the 3d, 4s and 4p states are 1.3, 2.0 and 2.2 Bohr, respec- 
tively. We find the small cut-off radius for the 3d chan- 
nel to be necessary in order to reproduce the all-electron 
bandstructure accurately. 

We obtain the self-consistent ground state using a 
16 x 16 x 16 Monkhorst-Pack 26 mesh of /c-points and a fic- 
titious Fermi smearing 27 of 0.02 Ry for the Brillouin-zone 
integration. The magnetization is along the [001] direc- 



tion, so that the only non-zero component of the total 
Berry curvature is the one along z. The spin magnetic 
moment is found to be 2.22 /ib, the same as that from an 
all-electron calculation 15 and close to the experimental 
value of 2.12 /ib- 

In order to calculate the Wannier functions, we freeze 
the self-consistent potential and perform a non-self- 
consistent calculation on a uniform n x n x n grid of 
fc-points. We tested several grid densities ranging from 
12=4 to ?i=10 and ultimately chose n=8 (see end of next 
subsection). Since we want to construct 18 WFs (s, p, 
and d-like for spin up and down), we need to include 
a sufficient number of extra bands to cover the orbital 
character of these intended WFs everywhere in the Bril- 
louin zone. With this in mind, we calculate the first 28 
bands at each fc-point, and then exclude any bands above 
58 eV (the "outer window" of Ref. 18). The 18 WFs are 
then disentangled from the remaining bands using the 
procedure described in the next section. 



B. Maximally- localized spinor Wannier functions 
for bec Fe 



The energy bands of interest (extending up to, and 
just above, the Fermi energy) have mainly mixed s and d 
character and are entangled with the bands at higher en- 
ergies. In order to construct maximally-localized WFs to 
describe these bands, we use a two-step post-processing 
procedure 18 as implemented in the WANNIER90 code. 28 
In the first ("disentangling") step, an 18-band subspace 
(the "projected space") is identified that minimizes the 
invariant part of the spread functional, subject to the 
constraint of including the states within an inner energy 
window. 18 We chose this window to span an energy range 
of 30 eV from the bottom of the valence bands (up to -Ewin 
in Fig. 1). In the second step, a set of maximally-localized 
WFs spanning this subspace is chosen by minimizing the 
gauge-dependent part of the spread functional. 20 

Although the original prescription for obtaining 
maximally-localized Wannier functions was formulated 
for the spinless case, it is trivial to adapt it to treat spinor 
wavef unctions, in which case the resulting WFs also have 
spinor character: each element of the overlap matrix, 
which is the key input to the WF-generation code, is 
simply calculated as the sum of two spin components, 



'k.b 



,k+b 



(42) 



where \u° k+b ) is one of the two components of the spinor 
wavefunction. 

In order to facilitate later analysis (e.g., of the orbital 
and spin character of various bands), we have modified 
the second step as follows. At each fc-point on the 8x8x8 
mesh, we form the 18x18 matrix representation of the 
spin operator S z = (h/2)<j z in the space of band states 
and diagonalize it. The 18-dimensional space at this fc- 
point is then divided in two 9-dimensional subspaces, a 




FIG. 2: (Color online). Isosurface contours of maximally- 
localized spin-up WF in bec Fe (red for positive value and 
blue for negative value), for the 8x8x8 fc-point sampling, 
(a) sp 3 d 2 -like WF centered on a Cartesian axis; (b) d xy -\ike 
WF centered on the atom. 



mostly spin-up subspace spanned by the eigenstates hav- 
ing S z eigenvalues close to +1, and a mostly spin-down 
subspace associated with eigenvalues close to —1 (we will 
use units of Ti/2 whenever we discuss S z in the remain- 
der of the manuscript). The spread functional is then 
minimized within each of these subspaces separately. We 
thus emerge with 18 well-localized WFs divided into two 
groups: nine that are almost entirely spin-up and nine 
that are almost entirely spin-down (in practice we find 
\(S Z }\ > 0.999 in all cases). While this procedure re- 
sults in a total spread that is slightly greater than would 
be obtained otherwise, we find that the difference is very 
small in practice, and the imposition of these rules makes 
for a much more transparent analysis of subsequent re- 
sults. For example, it makes it much easier to track the 
changes in the WFs before and after the spin-orbit cou- 
pling is turned on, or to identify the spin character of 
various pieces of the Fermi surface. 

To start the minimization procedure, we choose trial 
functions having pure spin character (up or down) and a 
spatial form of a Gaussian times a predetermined angu- 
lar factor. In our first attempts, we chose angular factors 
appropriate for the three ti g states d xy , d xz , and d yz ; the 
two e g states d z i and d x 2_ y 2; the three p states p x , p y , 
and p z ; and s. The iterative procedure 18 then projects 
these onto the band subspace and improves upon them. 
We found that the spread minimization procedure con- 
verted the t2 g trial functions into f 23-like WFs, while it 
mixed the other six states to form six hybrid WFs of 
sp 3 d 2 -type. 29 Having discovered this, we have modified 
our procedure accordingly: henceforth, we choose three 
t2 g -like trial functions and six sp 3 d 2 -\ike ones. With 
this initialization, we find the convergence to be quite 
rapid, with only about 100 iterations needed to get a 
well-converged spread functional. 

The WFs that result from this procedure are shown 
in Fig. 2. The up-spin WFs are plotted, but the WFs 
are very similar for both spins. An example of an sp 3 d 2 - 
hybrid WF is shown in Fig. 2(a); this one extends along 
the —a; axis, and the five others are similarly projected 
along the +x, ±y, and ±z axes. One of the i2 ff -like WFs 
is shown in Fig. 2(b); this one has xy symmetry, while 
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the others have xz and yz symmetry. The centers of the 
sp 3 d 2 -\ike WFs are slightly shifted from the atomic cen- 
ter along ±x, ±y, or ±z, while the £2g-like WFs remain 
centered on the atom. 

We studied the convergence of the WFs and interpo- 
lated bands as a function of the density n x n x n of the 
Monkhorst-Pack fc-mesh used for the initial ab-initio cal- 
culation. We tested n = 4, 6, 8, and 10, and found that 
n = 8 provided the best tradeoff between interpolation 
accuracy and computational cost. This is the mesh that 
was used in generating the results presented in Sec. V. 



position indices have been suppressed. The are 
radial functions multiplied by appropriate spin-angular 
functions and the Dy are the channel weights. We in- 
troduce the notation j3[ + \r) and (3[ \r) for the radial 
parts of |A,z+i/2, M ) and A,z-i/2,/x), respectively, and simi- 
larly define = A,z±i/2- Using this notation, we can 
define the scalar-relativistic (i.e., j-averaged) quantities 



Df 



l + l 
21 + 1 



D 



(+) 



I 



21 + 1 



D 



(-) 



(45) 



C. Variable spin-orbit coupling in the 
pseudopotential framework 

Since the AHE present in ferromagnetic iron is a spin- 
orbit-induced effect, it is obviously important to under- 
stand the role of this coupling as thoroughly as possible. 
For this purpose, it is very convenient to be able to treat 
the strength of the coupling as an adjustable parame- 
ter. For example, by turning up the spin-orbit coupling 
continuously from zero and tracking how various contri- 
butions to the AHC behave, it is possible to separate 
out those contributions that are of linear, quadratic, or 
higher order in the coupling strength. Some results of 
this kind will be given later in Sec. V. 

Because the spin-orbit coupling is a relativistic effect, 
it is appreciable mainly in the core region of the atom 
where the electrons have relativistic velocities. In a pseu- 
dopotential framework of the kind adopted here, both the 
scalar relativistic effects and the spin-orbit coupling are 
included in the pseudopotential construction. For exam- 
ple, in the Bachclct-Hamann scmilocal pseudopotential 
scheme, 30 the construction procedure generates, for each 
orbital angular momentum /, a scalar-relativistic poten- 
tial VJ sr (r) and a spin-orbit difference potential VJ so (r) 
which enter the Hamiltonian in the form 

V ps = J2 P i [VT(r) + AVT(r)L.S] , (43) 
i 

where Pi is the projector onto states of orbital angular 
momentum / and A controls the strength of spin-orbit 
coupling (with A=l being the physical value). For the 
free atom, this correctly leads to eigenstates labeled by 
total angular momentum j = I ±1/2. 

In our calculations, we employ fully non-local pseu- 
dopotentials instead of semilocal ones because of their 
computationally efficient form. In this case, controlling 
the strength of the spin-orbit coupling requires some alge- 
braic manipulation. We write the norm-conserving non- 
local pseudopotential operator as 

V V s = \Plj l *)Dlj(0U l >\ (44) 

where there is an implied sum running over the indices 
(orbital angular momentum /, total angular momentum 
j = I ± 1/2, and ji — — j, j) and species and atomic 



(46) 

and the corresponding spin-orbit difference quantities 

D% = D tj - Df , (47) 

IA S P = IAi M > ~ IA S P • (48) 

where is /?f r (r) multiplied by the spin- angular func- 

tion with labels (Ij/J,). Then the non-local pseudopoten- 
tial can be written as 

V ps = V s " + A V so (49) 

where 

V sr = \f3^)Dr(f3?l\ (50) 

and 

Vso - |/3^)A S °(/3^| 

+ \^)(Dr+D%)(pr jtt \ 

+ \0^)(Dr + D!°)(l3?^\. (51) 

This clearly reduces to the desired results (44) for A = 1 
and (50) for A = 0. 

V. RESULTS 

In this section, we present the results of the calcula- 
tions of the Berry curvature and its integration over the 
BZ using the formulas presented in Sec. Ill, for the case 
of bec Fe. 



A. Berry Curvature 

We begin by illustrating the very sharp and strong 
variations that can occur in the total Berry cur- 
vature, Eq. (8), near Fermi-surface features in the 
bandstructure. In Fig. 3(a) we plot the energy bands 
(top subpanel) and the total Berry curvature (bottom 



10 



-0.4 



: 2000 









(a) ; 












J 


1 




I 


I 



(0,0,0) 
0.6 
0.4 



(1,0,0) 



% 0.2 

M 

1-0-2 
-0.4 

% 

| 4000 
J 2000 



y / / 






(b) ; 










* 









r 

(0,0,0) 



H 

(1,0,0) 



p 

(Vi.'/j.'/z) 



FIG. 3: Band structure and total Berry curvature, as cal- 
culated using Wannier interpolation, plotted along the path 
F-H-P in the Brillouin zone, (a) Computed at the full spin- 
orbit coupling strength A = 1. (b) Computed at the reduced 
strength A = 0.25. The peak marked with a star has a height 
of 5xl0 4 a.u. 



subpancl) in the vicinity of the the zone-boundary point 
H = ^(1,0,0), where three states, split by the spin- 
orbit interaction, lie just above the Fermi level. The large 
spike in the Berry curvature between the H and P points 
arises where two bands, split by the spin orbit interac- 
tion, lie on either side of the Fermi level. 15 This gives 
rise to small energy denominators, and hence large con- 
tributions, mainly in Eq. (31). On reducing the strength 
of the spin-orbit interaction as in Fig. 3(b), the energy 
separation between these bands is reduced, resulting in a 
significantly sharper and higher spike in the Berry curva- 
ture. A second type of sharp structure is visible in Fig. 3, 
where one can see two smaller spikes, one at about 40% 
and another at about 90% of the way from Y to H, which 
decrease in magnitude as the as the spin-orbit coupling 
strength is reduced. These arise from pairs of bands that 
straddle the Fermi energy even in the absence of spin- 
orbit interaction. Thus, the small spin-orbit coupling 
does not shift the energies of these bands significantly, 
but it does induce an appreciable Berry curvature that is 
roughly linear in the spin-orbit coupling. 

The decomposition of the total Berry curvature into its 
various contributions in Eq. (29) is illustrated by plot- 
ting the first ("0") term, the second and third ("D- 
A") terms, and the fourth ("D-D" or Kubo-like) term 




FIG. 4: Decomposition of the total Berry curvature into con- 
tributions coming from the three kinds of terms appearing in 
Eq. (29). The path in fc-space is the same as in Fig. 3. Dotted 
line is the first (Q.) term, dashed line is the sum of second and 
third (D-A) terms, and solid line is the fourth (D-D) term 
of Eq. (29). Note the log scale on the vertical axis. 



of Eq. (29) separately along the line r-H-P. Note the 
logarithmic scale. The results confirm the expectations 
expressed in Sees. Ill C and III D, namely, that the largest 
terms would be those reflecting large contributions to D 
arising from small energy denominators. Thus, the Q 
term remains small everywhere, the D-A terms become 
one or two orders of magnitude larger at places where 
small energy denominators occur, and the D-D term, 
Eq. (31), is another one or two orders larger in those 
same regions. Scans along other lines in /c-space reveal 
similar behavior. We may therefore expect that the D-D 
term will make the dominant overall contribution to the 
AHC. As we shall show in the next subsection, this is 
precisely the case. 

In order to get a better feel for the connection be- 
tween Fermi surface features and the Berry curvature, 
we next inspect these quantities on the k y = plane in 
the Brillouin zone, following Rcf. 15. In Fig. 5 we plot 
the intersection of the Fermi surface with this plane and 
indicate, using color coding, the S z component of the 
spin carried by the corresponding wavefunctions. The 
good agreement between the shape of the Fermi surface 
given here and in Fig. 3 of Ref. 15 is further evidence that 
the accuracy of our approach matches that of all-electron 
methods. It is evident that the presence of the spin-orbit 
interaction, in addition to the exchange splitting, is suffi- 
cient to remove all degeneracies on this plane, 31 changing 
significantly the connectivity of the Fermi surface. 

The calculated Berry curvature is shown in Fig. 6. It 
can be seen that the regions in which the Berry curvature 
is small (light green regions) fill most of the plane. The 
largest values occur at the places where two Fermi lines 
approach one another, consistent with the the discussion 
of Fig. 3. Of special importance are the avoided cross- 
ings between two bands having the same sign of spin, or 
between two bands of opposite spin. Examples of both 
kinds are visible in the figure, and both tend to give rise 
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FIG. 5: (Color online). Lines of intersection between the 
Fermi surface and the plane k y = 0. Colors indicate the S z 
spin-component of the states on the Fermi surface (in units 
of h/2). 
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FIG. 6: (Color online). Calculated total Berry curvature Q z 
in the plane k y = (note log scale). Intersections of the Fermi 
surface with this plane are again shown. 

to very large contributions in the region of the avoided 
crossing. Essentially, the spin-orbit interaction causes 
the character of these bands to change extremely rapidly 
with k near the avoided crossing; this is the origin of the 
large Berry curvature. The large contributions near the 
H points correspond to the peaks that were already men- 
tioned in the discussion of Fig. 3, resulting from mixing 
of nearly degenerate bands by the spin-orbit interaction. 



B. Integrated anomalous Hall conductivity 

We now discuss the computation of the AHC as an 
integral of the Berry curvature over the Brillouin zone, 



TABLE I: Convergence of AHC with respect to the density 
of the nominal fc-point mesh (left column) and the adaptive 
refinement scheme used to subdivide the mesh in regions of 
large contributions (middle column). 



fc-point mesh Adaptive refinement a (Q cm) 



200 


X 


200 


X 


200 


3x3x3 


774.55 


250 


X 


250 


X 


250 


3x3x3 


774.84 


320 


X 


320 


X 


320 


3x3x3 


775.80 


200 


X 


200 


X 


200 


5x5x5 


765.96 


250 


X 


250 


X 


250 


5x5x5 


766.37 


320 


X 


320 


X 


320 


5x5x5 


766.76 


200 


X 


200 


X 


200 


7x7x7 


763.87 


250 


X 


250 


X 


250 


7x7x7 


764.84 


320 


X 


320 


X 


320 


7x7x7 


765.10 


320 


X 


320 


X 


320 


9x9x9 


764.59 


320 


X 


320 


X 


320 


11 x 11 x 11 


764.37 


320 


X 


320 


X 


320 


13 x 13 x 13 


764.27 



Eq. (7). We first define a nominal N Q xN x N mesh that 
uniformly fills the Brillouin zone. We next reduce this to 
a sum over the irreducible wedge that fills ^th of the 
Brillouin zone, using the tetragonal point-group symme- 
try (broken from cubic by the onset of ferromagnetism) , 
and calculate f2 z on each mesh point using Eq. (29). Fi- 
nally, following Yao et al., we implement an adaptive 
mesh refinement scheme in which we identify those points 
of the A;-space mesh at which the computed Berry curva- 
ture exceeds a threshold value f2 cut , and recompute U, z 
on an N a x N a x N a submesh spanning the original cell 
associated with this mesh point. The AHC is then com- 
puted as a sum of fl z over this adaptively refined mesh 
with appropriate weights. 

The convergence of the AHC with respect to the choice 
of mesh is presented in Table I. We have chosen f2 cut = 
1.0 x 10 2 a.u., which causes the adaptive mesh refinement 
to be triggered at approximately 0.11% of the original 
mesh points. The value of 751 (fl cm) -1 reported previ- 
ously in Ref. 15 corresponds to a mesh similar to the one 
in the first line of Table I; our value of 775 (O cm) -1 for- 
tius mesh thus agrees to within a few percent with their 
value. Based on the results of Table I, we estimate the 
converged value of a to be 764 (f2 cm) -1 . 

It can be seen from Table I that a 200 x 200 x 200 
mesh with 3x3x3 refinement approaches within ~1% 
of the converged value. It is also evident that the level 
of refinement is more important than the fineness of the 
nominal mesh; a 200 x 200 x 200 mesh with 5x5x5 
adaptive refinement yields a result that is within 0.1% of 
the converged value, better than a 320 x 320 x 320 mesh 
with a lower level of refinement. 

It is interesting to decompose the total AHC into con- 
tributions coming from different parts of the Brillouin 
zone. For example, as we saw in Fig. 6, there is a smooth, 
low-intensity background that fills most of the volume of 
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TABLE II: Contributions to the AHC coming from different 
regions of the Brillouin zone, as defined in the text. 



AE (eV) 


like-spin (%) 


opposite-spin (%) 


smooth (%) 


0.1 


21 


26 


53 


0.2 


23 


51 


26 


0.5 


30 


68 


2 



1000r 



800 - 



600 - 



400- 



X 



200- 



thc Brillouin zone, and it is hard to know a priori whether 
the total AHC is dominated by these contributions or by 
the much larger ones concentrated in small regions. With 
this motivation, we have somewhat arbitrarily divided 
the Brillouin zone into three kinds of regions, which we 
label as 'smooth', 'like-spin', and 'opposite-spin'. To do 
this, we identify fc-points at which there is an occupied 
band in the interval [Ef — AE, Ef] and an unoccupied 
band in the interval [Ef,Ef + AE], where AE is arbi- 
trarily chosen to be a small energy such as 0.1, 0.2, or 
0.5 eV. If so, the fc-point is said to belong to the 'like-spin' 
or 'opposite-spin' region depending on whether the dom- 
inant characters of the two bands below and above the 
Fermi energy are of the same or of opposite spin. Other- 
wise, the Appoint is assigned to the 'smooth' region. As 
shown in Table II, the results depend strongly on the 
value of AE. Overall, what is clear is that the major 
contributions arise from the bands within ±0.5 eV of Ef, 
and that neither like-spin nor opposite-spin contributions 
are dominant. 

Next, we return to the discussion of the decomposition 
of the total Berry curvature in Eq. (29) into f2, D-A, 
and D-D terms. We find that these three kinds of terms 
account for -0.20%, 0.71%, and 99.48%, respectively, of 
the total AHC. (Similarly, for the alternative decomposi- 
tion of Appendix A, the second term of Eq. (A4) is found 
to be responsible for more than 99% of the total.) Thus, 
if a 1% accuracy is acceptable, one could actually neglect 
the ft and D-A terms entirely, and approximate the total 
AHC by the D-D (Kubo-like) terms alone, Eq. (31). 

While we had anticipated in Sec. IIIC that the D- 
D terms should be expected to dominate, the extent to 
which that occured in the actual calculation is somewhat 
surprising and merits further discussion. It is important 
to emphasize that this should not be expected to oc- 
cur when using an arbitrary Wannier representation, but 
only for WFs which minimize the spread functional. In- 
deed, only the sum of all terms in Eq. (29) is uniquely 
defined; taken separately, the ft, D-A, and D-D terms 
depend on the choice of gauge. Moreover, while the ft 
and D—A terms involve both the Hamiltonian and posi- 
tion matrix elements between WFs, the dominant D-D 
term only depends on the Hamiltonian matrix elements. 
Since the minimization of the gauge-dependent part of 
the spread functional corresponds precisely to minimiz- 
ing the RMS average magnitude of the position matrix 
element between WFs, 20 it is perhaps not too surprising 
that we capture most of the AHC by neglecting the terms 
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FIG. 7: Anomalous Hall conductivity vs. spin-orbit coupling 
strength. 



which involve position matrix elements. 

From a computational point of view, the fact that the 
D-D terms are fully specified by the Hamiltonian matrix 
elements alone means that considerable savings can be 
obtained by avoiding the evaluation of the Fourier trans- 
forms in Eqs. (36-37) at every interpolation point (and 
avoiding the setup of the matrix elements (On|r a |Rm), 
which can be costly in a real-space implementation). 
More importantly, this observation, if it turns out to hold 
for other materials as well, could prove to be important 
for future efforts to derive approximate schemes capa- 
ble of capturing the most important contributions to the 
AHC. 

Finally, we investigate how the total AHC depends 
upon the strength of the spin-orbit interaction, follow- 
ing the approach of Sec. IV C to modulate the spin-orbit 
strength. The result is shown in Fig. 7. We emphasize 
that our approach is a more specific test of the depen- 
dence upon spin-orbit strength than the one carried out 
in Ref. 15; there, the speed of light c was varied, which 
entails changing the strength of the various scalar rel- 
ativistic terms as well. Nevertheless, both studies lead 
to a similar conclusion: the variation is found to be lin- 
ear for small values of the spin-orbit coupling (A <C 1), 
while quadratic or other higher-order terms also become 
appreciable when the full interaction is included (A = 1). 



C. Computational Considerations 

The computational requirements for this scheme are 
quite modest. The self-consistent ground state calcula- 
tion and the construction of the WFs takes 2.5 hours on 
a single 2.2GHz AMD-Opteron processor. The expense 
of computing the AHC as a sum over interpolation mesh 
points depends strongly on the density of the mesh. On 
the same processor as above, the average CPU time to 
evaluate 2 on each fc-point was about 14 msec. We find 
that the mesh refinement operation does not significantly 
increase the total number of fc-point evaluations until the 
refinement level N a exceeds ~10. Allowing for the fact 



13 



that the calculation only needs to be done in the irre- 
ducible 4rth of the Brillouin zone, the cost for the AHC 
16 ' 

evaluation on a 200x200x200 mesh is about 2 hours. 

The CPU time per fc-point evaluation is dominated 
(roughly 90%) by the Fourier transform operations 
needed to construct the objects in Eqs. (34-37). The 
diagonalization of the 18x18 Hamiltonian matrix, and 
other operations needed to compute Eq. (29), account 
for only about 10% of the time. The CPU requirement 
for the Fourier transform step is roughly proportional to 
the number of R vectors kept in Eqs. (34-37); it is possi- 
ble that this number could be reduced by exploring more 
sophisticated methods for truncating the contributions 
coming from the more distant R vectors. 

Of course, the loop over fc-points in the AHC calcula- 
tion is trivial to parallelize, so for dense fc-meshes we 
speed up this stage of the calculation by distributing 
across multiple processors. 



VI. SUMMARY AND DISCUSSION 

In summary, we have developed an efficient method 
for computing the intrinsic contribution to the anomalous 
Hall conductivity of a metallic ferromagnet as a Brillouin- 
zone integral of the Berry curvature. Our approach is 
based on Wannier interpolation, a powerful technique for 
evaluating properties that require a very dense sampling 
of the Brillouin zone or Fermi surface. The key idea is 
to map the low-energy first-principles electronic struc- 
ture onto an "exact tight-binding model" in the basis of 
appropriately constructed Wannier functions, which are 
typically partially occupied. In the Wannier representa- 
tion the desired quantities can then be evaluated at arbi- 
trary fc-points at very low computational cost. All that 
is needed is to evaluate, once and for all, the Wannier- 
basis matrix elements of the Hamiltonian and a few other 
property-specific operators (namely, for the Berry curva- 
ture, the three Cartesian position operators). 

When evaluating the Berry curvature in this way, the 
summation over all unoccupied bands and the expensive 
calculation of the velocity matrix elements needed in the 
traditional Kubo formula are circumvented. 32 They are 
replaced by quantities defined strictly within the pro- 
jected space spanned by the WFs. Our final expression 
for the total Berry curvature, Eq. (29), consists of three 
types of terms, i.e., the fi, D— A, and D-D terms. 

We have applied this approach to calculate the AHC 
of bec Fe. While our Wannier interpolation formalism, 
with its decomposition (29), is entirely independent of the 
choice of an all-electron or pseudopotential method, we 
have chosen here a relativistic pseudopotential approach 
that includes scalar relativistic effects as well as the spin 
orbit interaction. We find that this scheme successfully 
reproduces the fine details of the electronic structure and 
of the Berry curvature in good agreement with a previous 
calculation 15 that used an all-electron LAPW method. 33 
The computed AHC is also quite close to that computed 



previously. 15 

Interestingly, we found that more than 99% of the to- 
tal Berry curvature is concentrated in the D-D term of 
our formalism. This term, given explicitly in Eq. (31), 
takes the form of a Kubo-likc Berry curvature formula for 
the "tight-binding states." Thus we arrive at the very ap- 
pealing result that a Kubo picture defined within the "ex- 
act tight-binding space" gives an excellent representation 
of the Berry curvature in the original ab-initio space. It 
is worth pointing out that, unlike the other three terms, 
this term depends exclusively on the Hamiltonian matrix 
elements between the Wannier orbitals, and not on the 
position matrix elements. This result merits further in- 
vestigation, and may be relevant for recent discussions 
in the tight-binding literature on how to incorporate the 
coupling to electromagnetic radiation in a tight-binding 
description. 19 ' 34 ~ 36 

Several directions for future studies suggest them- 
selves. For example, it would be desirable to obtain a bet- 
ter understanding of how the AHC depends on the weak 
spin-orbit interaction. As we have seen, this weak inter- 
action causes splittings and avoided crossings that give 
rise to very large Berry curvatures in very small regions 
of fc-space. There is a kind of paradox here. Our numer- 
ical tests, as in Fig. 7, demonstrate that the AHC falls 
smoothly to zero as the spin-orbit strength A is turned 
off, suggesting that a perturbation theory in A should be 
applicable. However, in the limit that A becomes small, 
the full calculation becomes more difficult, not less: the 
splittings occur in narrower and narrower regions of k- 
space, energy denominators become smaller, and Berry 
curvature contributions become larger (see Fig. 3), even 
if the integrated contribution is going to zero. It would be 
of considerable interest, therefore, to explore ways to re- 
formulate the perturbation theory in A so that the expan- 
sion coefficients can be computed in a robust and efficient 
fashion. Because the exchange splitting is much larger 
than the spin-orbit splitting, it may also be of use to in- 
troduce two separate couplings that control the strengths 
of the spin-flip and spin-conserving parts of the spin-orbit 
interaction respectively, and to work out the perturbation 
theory in these two couplings independently. 

Another promising direction is to explore whether the 
AHC can be computed as a Fermi-surface integral using 
the formulation of Haldane 11 in which an integration by 
parts is used to convert the volume integral of the Berry 
curvature to a Fermi-surface integral involving Berry cur- 
vatures or potentials. Such an approach promises to be 
more efficient than the volume- integration approach, pro- 
vided that a method can be developed for carrying out 
an appropriate sampling of the Fermi surface. This is 
likely to be a delicate problem, however, since the weak 
spin-orbit splitting causes Fermi sheets to separate and 
reattach in a complex way at short fc-scales, and the dom- 
inant contributions to the AHC are likely to come from 
precisely these portions of the reconstructed Fermi sur- 
face that are the most difficult to describe numerically 

In any case, even without such further developments, 
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the present approach is a powerful one. It reduces the ex- 
pense needed to do an extremely fine sampling of Fermi- 
surface properties to the level where the AHC of a mate- 
rial like bec Fe can be computed on a workstation in a few 
hours. This opens the door to realistic calculations of the 
intrinsic anomalous Hall conductivity of much more com- 
plex materials. More generally, the techniques developed 
here for the AHE are readily applicable to other prob- 
lems in the physics of metals which also require a very 
dense sampling of the Fermi surface or Brillouin zone. 
For example, an extension of these ideas to the evalua- 
tion of the electron-phonon coupling matrix elements by 
Wannier interpolation is currently under way. 40 
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APPENDIX A: ALTERNATIVE EXPRESSION 
FOR THE BERRY CURVATURE 

In this Appendix, we return to Eq. (27) and rewrite 
it in such a way that all of the large, rapidly varying 
contributions arising from small energy denominators in 
the expression for D a , Eq. (24), are segregated into a 
single term. We do this by solving Eq. (25) for D a and 
substituting into Eq. (27) to obtain 



o (H) - o (H) 

il af3 - "a/3 



A^^ 



+ i 



(Al) 



Then only the last term will contain the large, rapid vari- 
ations. This equation could have been anticipated based 
on the fact that the tensor 

fia/3 = - i[A a , Ap] (A2) 

is well known to be a gauge-covariant quantity; 20,37 ap- 
plying Eq. (21) to Q a p then leads directly to Eq. (Al). 
This formulation provides an alternative route to the 



calculation of the matrix Ct^J: 



evaluate in the 



Wannier representation using Eqs. (A5-A6) below, con- 
vert it to fi^g via Eq. (21), compute A^ using Eq. (25), 
and assemble 



rt3=&3 + i[AW,AW]. 



(A3) 



The large and rapid variations then appear only in the 
last term involving commutators of the A matrices. 

In Sec. HID, we showed how to write the total Berry 
curvature fi a| g(k) as a sum over bands in such a way that 
potentially troublesome contributions coming from small 
energy denominators between pairs of occupied bands are 



explicitly excluded, leading to Eq. (29). The correspond- 
ing expression based on Eq. (A3) is 

n Q/9 (k) = Y,f n n™ afi 

n 

+ J2^-f m )A^2, a A^- (A4) 



Now, in addition to the four quantities given in 
Eqs. (34-37), we need a corresponding equation for f2 aj g. 
After some manipulations, we find that 



E 

R 



e ik ' R w r , 



i/3 (R) 



(A5) 



where 

«Va/?(R.) 



-i^(On\r a \R.'m)(R'm\r \Rn) 

R/m 

+i^(0n|f / 3|R'm)(R'm|f a |Rn) . 



R/m 



(A6) 



This formulation again requires the same basic ingredi- 
ents as before, namely, the Wannier matrix elements of H 
and r a . In some respects it is a little more elegant than 
the formulation of Eq. (29). However, the direct evalu- 
ation of w n _ a p in the Wannier representation, as given 
in Eq. (A6), is not as convenient because of the extra 
sum over intermediate WFs appearing there; moreover, 
Wn.a/3 is longer-ranged than the Hamiltonian and coordi- 
nate matrix elements. Also, one appealing feature of the 
formulation of Section III, that more than 99% of the ef- 
fect can be recovered without using the position-operator 
matrix elements, is lost in this reformulation. We have 
therefore chosen to base our calculations and analysis on 
Eq. (29) instead. 

It is informative to obtain Eq. (A3) in a different way: 
define the gauge-invariant band projection operator 20 
Pk = Yln=i l u «k)(w„k| and its complement Qk = l- Pk- 
Inserting 1 = Qk + Pk into Eq. (18) in the Hamiltonian 
gauge then yields directly Eq. (A3) since, as can be easily 
verified, Eq. (A2) may be written as 



n nm , a /3 = i{d a u n \dpu m ) - i{df}U n \d a u ri 



(A7) 



where d a = Qd a . The gauge-covariance of f2 a/ g follows 
directly from the fact that d a is a gauge-covariant deriva- 
tive, in the sense that \d a u^} = Y.Z=i \9 a u^)U m „ is 
the same transformation law as Eq. (15) for the Bloch 
states themselves. It is apparent from this derivation that 
as the number M of WFs increases and Pk approaches 
1, the second term on the right-hand side of Eq. (A4) 
increases at the expense of the first term. Indeed, in the 
large- M limit the entire Berry curvature is contained in 
the second term. For the choice Wannier orbitals de- 
scribed in the main text for bec Fe, that term already 
accounts for 99.8% of the total AHC. 
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APPENDIX B: FINITE-DIFFERENCE 
APPROACH 

In this Appendix, we outline an alternative scheme for 
computing the AHC by Wannicr interpolation. The es- 
sential difference relative to to the approaches described 
in Section III and in Appendix A is that the needed fc- 
space derivatives are approximated here by finite differ- 
ences instead of being expressed analytically in the Wan- 
nier representation. 

This approach is most naturally applied to the zero- 
temperature limit where there are exactly At occu- 
pied states at a given k. Instead of starting from the 
Berry curvature of each individual band separately as 
in Eq. (6), we find it convenient here to work from the 
outset with the total Berry curvature 



fia/3(k) 



n=l 



(k) 



(Bl) 



of the occupied manifold at k (the zero-temperature limit 
of Eq. (19)). We now introduce a covariant deriva- 



tive d. 



W) 



Q { " k 'd a designed to act on the occu- 
i - pf k) and pf k) = 
J2n=i \ u nk)(u n k\- The only difference with respect to 
the definition of d a in Appendix A is that the projection 
operator here spans the TVk occupied states only, instead 
of the M states of the full projected space. Accordingly 
terms such as "gauge-covariance" and "gauge-invariancc" 
are to be understood here in a restricted sense. For exam- 
ple, the statement that is a gauge-covariant deriva- 
tive means that under an Ak x Ak unitary rotation W(k) 
between the occupied states at k it obeys the transfor- 
mation law 

l^unk) - £ \di N ^u mk )U mn (k). (B2) 

(We will use calligraphic symbols to distinguish Ak x Ak 
matrices such as U from their MxM counterparts such as 
U .) We now define a gauge-covariant curvature f2^ k ^(k) 

by replacing d by <9W) in Eq. (A7). Since the trace 
of a commutator vanishes, it follows from Eq. (A2) that 
Eq. (Bl) can be written as 



pied states only; here 



,JV k 



^a/3(k) = Tr 



(JV k 



(B3) 



where the symbol TrW) denotes the trace over the oc- 
cupied states. 

The advantage of this expression over Eq. (Bl) is that 
the covariant derivative of a Bloch state can be approxi- 
mated by a very robust finite-differences formula: 38,39 



9, 



(JVk) 



(JVk) 

k,b 



(B4) 



where the sum is over shells of neighboring fc-points, 20 
as in Eq. (41), and we have defined the gauge-invariant 
operator 



JV k 



(B5) 



in terms of the gauge-covariant "dual states" 



JVk 



i,k,b) — £ \u m ,k+b) (Qk+b,k), 



(B6) 



m— 1 



Here <2k+b,k is the inverse of the Ak x Ak overlap matrix, 



where 



Qk+b,k — (>?k,k+b) 



(<Sk,k+b)„ m = ( u nk|Wm,k+b) • 



(B7) 



(B8) 



The discretization (B4) is immune to arbitrary gauge 
phases and unitary rotations among the occupied states. 
Because of that property, the occurrence of band cross- 
ings and avoided crossings does not pose any special 
problems. 

Inserting Eqs. (B4-B8) into Eq. (B3) and using 

Qk,k+b = Qk+b k> we nn d tnat an appropriate finite- 
difference expression for the total Berry curvature is 



n 



wo™ -9 v 

a/3 



(k) = 2 > w hl w b2 6i, Q b 2 j3 A k ,bi,b 2 



bi ,b 2 



(B9) 



where 



Ak,bi,b 2 = -ImTr'" 11 ' [Qk,k+bi«Sk+bi,k+b 2 Qk+b 2 ,k] • 

(BIO) 

This expression is manifestly gauge-invariant, since both 
S and <2 are gauge-covariant matrices, i.e., 5k,k+b * 
W (k)<Sk,k+b^(k + b), and the same transformation law 
holds for Q k ,k+b- 

Eqs. (B9-B10) can be evaluated at an arbitrary point 
k once the overlap matrices <Sk,k+b are known. For that 
purpose we construct a uniform mesh of spacing Afc in 
the immediate vicinity of k, set up the needed shells 
of neighboring fc-points k + b on that local mesh, and 
then evaluate 5k,k+b by Wannier interpolation. Since 
the WFs span the entire M-dimensional projected space, 
at this stage we revert to the full MxM overlap ma- 
trices Sk,k+b- I n the Wannier gauge they are given by a 
Fourier transform of the form 

(S%U) = E e * k - R <0^* MR - f) |Hm> . (Bll) 

V / run ~~ 

R, 

For sufficiently small Afc, this can be approximated as 



l^k.k+bj, 



ibV e lk - R (On|f|Rm) . (B12) 



R 
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Note that the dependence of the last expression on AA; is 
trivial, since it only enter as a multiplicative prefactor. In 
practice one chooses Afc to be quite small, ~ 1CP 6 a.u. -1 , 
so as to reduce the error of the finite-differences expres- 
sion. 

In the Wannier gauge the occupied and empty states 
are mixed with one another, because the WFs are par- 
tially occupied. In order to decouple the two subspaces 
we perform the unitary transformation 

^S + b = ^(k)4T+b^(k + b). (B13) 

This produces the full M x M overlap matrix in the 
Hamiltonian gauge. The x A" k submatrix in the up- 
per left corner is precisely the matrix «Sj^ +b needed in 
Eq. (BIO). 

Like the approach described in the main text, this ap- 
proach still only requires the WF matrix elements of the 
four operators H and f a {a — x, y, and z). We have 
implemented it, and have checked that the results agree 
closely with those obtained using using the method of the 
main text. Although not as elegant, this approach has 
the interesting feature of circumventing the evaluation of 



the matrix D { ' , Eq. (24). This may be advantageous in 
certain special situations. For example, if a parameter 
such as pressure is tuned in such a way that a /c-space 
Dirac monopolc 14 drifts to the Fermi surface, the vanish- 
ing of the energy denominator in Eq. (24) may result in 
a numerical instability when trying to find the monopolc 
contribution to the AHC. 

We conclude by noting that Eq. (BIO) is but one of 
many possible finite-differences expressions, and may not 
even be the most convenient one to use in practice. By 
recalling that the Berry curvature is the Berry phase per 
unit area, one realizes that in the small-Afc limit of inter- 
est, the quantity A kj bi,b 2 m Eq. (B9) can be viewed as 
the discrete Berry phase <j) accumulated along the small 
loop k — ► k + bi — ► k + b2 — > k. As is well-known, the 
Berry phase around a discrete loop is defined as 13 

<f> = - Im lndet [5 kik+ b 1 <Sk+b 1 ,k+b 2 5 k+ b 2 ,k] • (B14) 

It can be shown that = A k ,bi,b 2 + 0(Ak 2 ), so that 
for small loops the two formulas agree. Eq. (B14) has 
the practical advantage over Eq. (BIO) that it does not 
require inverting the overlap matrix. 
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